Statistical assumption
Statistical assumptions are general assumptions about statistical populations.
Statistics, like all mathematical disciplines, does not generate valid conclusions from nothing. To draw useful conclusions about real statistical populations, it is usually necessary to make some background assumptions. These must be made with care, because inappropriate assumptions can lead to wildly inaccurate conclusions.
The most commonly applied statistical assumptions are:
- independence of observations from each other: Assuming independence when it does not hold is a particularly common error.[1] (see statistical independence)
- independence of observational error from potential confounding effects
- exact or approximate normality of observations: The assumption of normality is often erroneous, because many populations are not normal. However, it is standard practice to assume that the mean of a sufficiently large random sample is approximately normal, because of the central limit theorem, provided the population variance is finite. (see normal distribution)
- linearity of graded responses to quantitative stimuli (see linear regression)
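The point about normality and the central limit theorem can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the exponential population, the sample size of 50, the seed and the helper names `skewness` and `sample_means` are all chosen here for illustration and are not part of any standard procedure. Skewness is used as a rough summary of departure from the symmetry of a normal distribution.

```python
import random
import statistics


def skewness(values):
    """Sample skewness: the mean cubed z-score (about 0 for symmetric data)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / sd) ** 3 for v in values])


def sample_means(n, reps, rng):
    """Draw `reps` random samples of size `n` from an Exponential(1)
    population (an illustrative choice) and return each sample's mean."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Individual observations: strongly right-skewed (theoretical skewness is 2),
# so assuming the observations themselves are normal would be erroneous.
observations = [rng.expovariate(1.0) for _ in range(20000)]

# Means of random samples of size 50: theoretical skewness is 2 / sqrt(50),
# roughly 0.28, so the sample mean is much closer to symmetric.
means = sample_means(n=50, reps=5000, rng=rng)

skew_observations = skewness(observations)
skew_means = skewness(means)
print(f"skewness of observations: {skew_observations:.2f}")
print(f"skewness of sample means: {skew_means:.2f}")
```

The sample means are far less skewed than the individual observations, which is why approximate normality is often assumed for the sample mean even when the population is clearly not normal. For small samples or heavier-tailed populations the approximation may be poor.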
Types of assumptions
Statistical assumptions can be categorised into a number of types:
- Non-modelling assumptions. Statistical analyses of data involve making certain types of assumption, whether or not a formal statistical model is used. Such assumptions underlie even descriptive statistics.
  - Population assumptions. A statistical analysis of data is made on the basis that the available observations derive from either a single population or several different populations, each of which is in some way meaningful. Here a "population" is, informally, the set of other possible observations that might have been made. The assumption is simply that the observations obtained are representative of the problem, topic or class of objects being studied.
  - Sampling assumptions. These relate to the way in which observations have been gathered, and often involve an assumption of random selection of some type.[2]
- Modelling assumptions. These may be divided into the following types:
  - Distributional assumptions. Where a statistical model involves terms relating to random errors, assumptions may be made about the probability distribution of these errors.[3] In some cases, the distributional assumption relates to the observations themselves.
  - Structural assumptions. Statistical relationships between variables are often modelled by equating one variable to a function of another (or several others), plus a random error. Models often involve a structural assumption about the form of this functional relationship, for example linearity, as in linear regression. This can be generalised to models involving relationships between underlying unobserved latent variables.
  - Cross-variation assumptions. These assumptions involve the joint probability distributions of either the observations themselves or the random errors in a model. Simple models may include the assumption that observations or errors are statistically independent.
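As an illustration of how the modelling assumptions fit together, the simple linear regression model can be written out with each line labelled by the type of assumption it encodes. The notation ($y_i$, $x_i$, $\beta_0$, $\beta_1$, $\varepsilon_i$, $\sigma^2$) is a conventional choice assumed here, not taken from the cited sources.

```latex
% Simple linear regression, annotated by assumption type (illustrative notation).
\begin{align*}
  y_i &= \beta_0 + \beta_1 x_i + \varepsilon_i, \quad i = 1, \dots, n
      && \text{(structural: the mean response is linear in } x_i\text{)} \\
  \varepsilon_i &\sim N(0, \sigma^2)
      && \text{(distributional: normal errors with constant variance)} \\
  \varepsilon_1, \dots, \varepsilon_n &\ \text{mutually independent}
      && \text{(cross-variation: no dependence between errors)}
\end{align*}
```

Relaxing any one line gives a different model: a nonlinear mean function changes the structural assumption, a non-normal error law changes the distributional assumption, and serially correlated errors change the cross-variation assumption.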
Checking assumptions
Given that the validity of conclusions drawn from a statistical analysis depends on the validity of any assumptions made, it is important that these assumptions be reviewed at some stage. In some instances, for example where data are lacking, the review may have to be restricted to a judgement about whether an assumption is reasonable. This can be extended slightly by trying to judge what effect a departure from the assumptions might have. Where more extensive data are available, various procedures for statistical model validation can be applied, in particular those for regression model validation.
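One way such a check might look in practice is sketched below for the independence assumption in a simple linear regression. This is a minimal illustration, not a prescribed procedure: the Durbin-Watson statistic is just one of many possible diagnostics, and the simulated data, the seed and the helper names `fit_line`, `residuals_of` and `durbin_watson` are assumptions made for this example.

```python
import random
import statistics


def fit_line(x, y):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def residuals_of(x, y):
    """Residuals (observed minus fitted) from the least-squares line."""
    a, b = fit_line(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def durbin_watson(residuals):
    """Durbin-Watson statistic: near 2 when successive residuals are
    uncorrelated, well below 2 under positive serial correlation."""
    num = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    den = sum(r ** 2 for r in residuals)
    return num / den


rng = random.Random(1)  # fixed seed so the run is repeatable
x = [float(i) for i in range(200)]

# Case 1: independent errors, so the independence assumption holds.
y_indep = [2.0 + 0.5 * xi + rng.gauss(0.0, 1.0) for xi in x]

# Case 2: AR(1) errors (each error carries over 0.8 of the previous one),
# so the independence assumption is violated.
errors, prev = [], 0.0
for _ in x:
    prev = 0.8 * prev + rng.gauss(0.0, 1.0)
    errors.append(prev)
y_ar1 = [2.0 + 0.5 * xi + e for xi, e in zip(x, errors)]

dw_indep = durbin_watson(residuals_of(x, y_indep))
dw_ar1 = durbin_watson(residuals_of(x, y_ar1))
print(f"Durbin-Watson, independent errors: {dw_indep:.2f}")
print(f"Durbin-Watson, AR(1) errors:       {dw_ar1:.2f}")
```

A value close to 2 is consistent with independent errors, while a value far below 2 suggests positive serial correlation and hence that conclusions relying on independence should be treated with caution. Analogous residual-based checks exist for the normality and linearity assumptions.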
Notes
- ^ Kruskal, William (December 1988). "Miracles and Statistics: The Casual Assumption of Independence (ASA Presidential address)". Journal of the American Statistical Association 83 (404): 929–940. JSTOR 2290117.
- ^ McPherson, 1990 (Section 3.3)
- ^ McPherson, 1990 (Section 3.4.1)
Bibliography
- Kruskal, William (December 1988). "Miracles and Statistics: The Casual Assumption of Independence (ASA Presidential address)". Journal of the American Statistical Association 83 (404): 929–940. JSTOR 2290117.
References
- McPherson, G. (1990) Statistics in Scientific Investigation: Its Basis, Application and Interpretation, Springer-Verlag. ISBN 0-387-97137-8